Climate Dynamics manuscript No. 

(will be inserted by the editor) 



Maximum-Entropy Weighting of Multiple Earth Climate
Models 



Robert K. Niven 






8 April 2011; revised 30 June 2011; accepted 8 August 2011 

Abstract A maximum entropy-based framework is pre- 
sented for the synthesis of projections from multiple 
Earth climate models. This identifies the most represen- 
tative (most probable) model from a set of climate mod- 
els - as defined by specified constraints - eliminating 
the need to calculate the entire set. Two approaches are 
developed, based on individual climate models or en- 
sembles of models, subject to a single cost (energy) con- 
straint or competing cost-benefit constraints. A finite- 
time limit on the minimum cost of modifying a model 
synthesis framework, at finite rates of change, is also 
reported. 

Keywords climate model · maximum entropy method · Boltzmann principle ·
thermodynamics · cost-benefit analysis · finite-time information limit



1 Introduction 

A major challenge facing humanity is the possibility of 
climate change due to human and/or natural forcings, 
and how best to respond in a rational and informed 
manner. To this end, detailed global circulation models 
(GCMs) have been developed to predict the behaviour 
of the Earth climate system (atmosphere and oceans), 
involving solution of the continuity, Navier-Stokes, an- 
gular momentum and energy equations and constitutive 
relations over two- or three-dimensional domains, subject
to various initial and boundary conditions [1].
These are run interrogatively to yield climate projec- 
tions - predictions as a function of future time - to ex- 
amine various forcing and response scenarios. However, 
a serious difficulty for policy-makers is the promulga- 
tion of multiple models by different research groups, 
due to different modelling priorities, assumptions and 
input parameters, and inherent difficulties in the con- 
struction of climate models, especially in the handling 
of coupled phenomena (e.g. humidity ,3]) and the need 
to dramatically reduce their computational complexity, 
necessitating a turbulence closure scheme. Even with 
the same (or similar) inputs, different models can pro- 
vide significantly different climate projections 0] . A ra- 
tional framework for the synthesis of such projections 
- which operates in a transparent and fully defensible 
manner - is urgently required, to avoid the lack of ob- 
jectivity of seemingly ad hoc amalgamations of projec- 
tions from different groups. 

Over the past century, maximum entropy (MaxEnt) 
methods have been developed for the construction of 
probabilistic models, initially in thermodynamics 
and subsequently for all probabilistic systems 0, 
Although imbued with several information-theoretic in- 
terpretations 0, H, M, the success of such models 
rests ultimately on the maximum probability (MaxProb)
principle of Boltzmann [5, 6, 12, 13, 14, 15, 16, 17, 18]:
"a system can be represented by its most probable
state". This provides a probabilistic definition of the
(relative) entropy function:



$$\mathfrak{H}_{\mathrm{rel}} = K \ln \mathbb{P} \tag{1}$$



where $\mathbb{P}$ is the governing probability of an observable
realization (macrostate) of the system and $K$ is a constant.
The maximum of $\mathfrak{H}_{\mathrm{rel}}$ thus coincides with that of
$\mathbb{P}$. If the system can be represented by the allocation of
distinguishable balls (objects) to distinguishable boxes
(categories), then $\mathbb{P}$ will satisfy the multinomial
distribution [19, 20, 21]:



$$\mathbb{P} = \mathrm{Prob}(\mathbf{n} \,|\, \mathbf{q}, N) = N! \prod_{i=1}^{s} \frac{q_i^{n_i}}{n_i!} \tag{2}$$



where $n_i$ is the observed occupancy (number of balls)
and $q_i$ is the prior probability of the $i$th category, $N$
is the total number of balls and $s$ the number of categories.
Insertion of (2) into (1) with $K = N^{-1}$, taking
the asymptotic limit $N \to \infty$ and $n_i/N \to p_i$, gives the
(negative) Kullback-Leibler entropy function:

$$\mathfrak{H}_{\mathrm{rel}} = -\sum_{i=1}^{s} p_i \ln \frac{p_i}{q_i} \tag{3}$$



Maximisation of (3) for a system which satisfies (2),
subject to its constraints, is therefore equivalent to seek- 
ing its most probable realization in the asymptotic limit, 
subject to the same constraints. 
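
The asymptotic equivalence between (2) and (3) can be checked numerically. The following Python sketch is illustrative only: the prior probabilities and target frequencies are assumed values chosen for the demonstration, and the function names are hypothetical. It evaluates $N^{-1} \ln \mathbb{P}$ for increasing $N$ and compares it with the Kullback-Leibler form.

```python
import math

def log_multinomial(counts, priors):
    """ln of the multinomial probability in Eq. (2), via log-gamma for stability."""
    n_total = sum(counts)
    log_p = math.lgamma(n_total + 1)
    for n_i, q_i in zip(counts, priors):
        log_p += n_i * math.log(q_i) - math.lgamma(n_i + 1)
    return log_p

def kl_entropy(probs, priors):
    """Negative Kullback-Leibler function of Eq. (3); terms with p_i = 0 add nothing."""
    return -sum(p * math.log(p / q) for p, q in zip(probs, priors) if p > 0)

# Assumed prior and target frequencies (illustrative values, not from the paper).
priors = [0.5, 0.3, 0.2]
probs = [0.2, 0.3, 0.5]

target = kl_entropy(probs, priors)
gaps = []
for n_total in (10, 100, 1000, 10000):
    counts = [round(p * n_total) for p in probs]          # n_i = p_i N
    scaled = log_multinomial(counts, priors) / n_total    # K = 1/N in Eq. (1)
    gaps.append(abs(scaled - target))
    print(n_total, scaled, target)
```

Under these assumptions the gap between the two quantities shrinks as $N$ grows, which is the sense in which maximising (3) identifies the most probable realization.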

We therefore adopt a broader concept of 'entropy' 
than that normally used in the physical sciences. Cli- 
matologists will be familiar with the thermodynamic 
entropy S, which has a clearly defined meaning as the 
state function $S = \int \delta Q/T$ (Clausius) or $S = k \ln W$
(Boltzmann), where $\delta Q$ is the increment of heat entering
a system, $T$ is temperature, $k$ is the Boltzmann
constant and W is the number of microstates within a 
given realization (macrostate) of a system. Its rate of 
change is $dS/dt$, of which the excess (exported) component
is commonly termed the thermodynamic entropy
production a [jollsills^. However, under the MaxProb 
or MaxEnt approach adopted here, entropy acquires a 
more fundamental meaning in terms of the probabilis- 
tic state space of a system, however defined. To emphasise
their generic character, such entropies are here denoted
$\mathfrak{H}$. The thermodynamic entropy is in fact a special
case of the generic, being derivable by the application 
of MaxEnt to an energetic system 

SliSliS- The 

ensuing analyses are based entirely on generic entropy 
functions, not necessarily related to S; that said, much 
of the underlying mathematical structure is identical. 

The aim of this study is to construct a framework 
for the synthesis of climate projections from multiple 
climate models, based on the MaxProb (hence Max- 
Ent) principle. By analogy with thermodynamics, two 
approaches are presented, involving constraints on the 
properties of individual climate models or of ensem- 
bles of climate models. In each case, the analysis iden- 
tifies the most representative (most probable) model 
from a set of climate models, circumventing the need 
to calculate the entire set. Other implications of these 



frameworks, which arise from the mathematical struc- 
ture given by Jaynes are examined. In addition, 
we report a curious finite-time limit on the minimum 
cost of varying the overall framework at specified rates 
of change, using a theorem from finite-time thermodynamics
[22, 23, 24, 25, 26, 27].



2 Derivations 

Consider an individual Earth general climate model
(GCM), composed of $J$ separable computational components.
Each component $j = 1, \ldots, J$ is executed by a
single choice $i(j)$ of algorithm, methodology or paradigm,
from a total of $I(j)$ possible choices. As shown in Figure 1,
this gives a combinatorial scheme in which an individual
model is constructed from a set of unique choices
$i(j) \in \{1, \ldots, I(j)\}$. We assume that all models are
calculated using the same set of input parameters and
assumptions $\theta$; moreover, to accommodate variability
or errors in $\theta$, each model will yield a set or domain of
climate projections, which can be explored by Monte 
Carlo analysis or by some other means. If we move be- 
yond the deterministic mindset that an individual cli- 
mate model must be the "correct" one, how should we 
weight the projection sets from different climate mod- 
els, to obtain a (statistical) picture of their merged sets 
of projections? One could simply combine an available 
set of model outputs using equal or assigned weighting 
factors, as suggested in [J|, but unless every possible 
combination has been computed, the resulting compos- 
ite model will be rather arbitrary. In addition, if the 
model space is infinite (or merely very large), it will be 
impossible to compute the composite model in the life- 
time of the universe (or in any reasonable time frame). 
Moreover, the use of equal weights does not allow the 
incorporation of additional constraints on the model 
space. We therefore propose a MaxProb-based (hence 
MaxEnt-based) framework for the weighting of multi- 
ple climate models, for which two distinct approaches 
are available. 
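
The scaling argument above can be illustrated with a short Python sketch. The numbers of alternatives per component are assumed purely for illustration; the point is only that the model space is a Cartesian product, so its size is the product of the numbers of choices and grows geometrically with the number of components.

```python
import math
from itertools import product

# Assumed numbers of alternative algorithms per component (illustrative only).
choices_per_component = [3, 4, 3]

# Each model takes exactly one choice per component, so the model space is a
# Cartesian product of the per-component choice sets.
all_models = list(product(*(range(n) for n in choices_per_component)))
n_models = math.prod(choices_per_component)
print(len(all_models), n_models)

# A larger, still hypothetical, case: 40 components with 5 alternatives each.
n_large = math.prod([5] * 40)
print(n_large)
```

Even the modest hypothetical case of 40 components gives a model count far beyond what could be run exhaustively, which motivates weighting schemes that do not require the full set.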



2.1 "Microcanonical" Framework 

We first construct a "microcanonical" climate model
weighting framework, based on the properties of individual
climate models. Extending the representation in
Figure 1, consider a single climate model shown in
Figure 2(a), in which we choose to rank each choice of
algorithm or method $i(j)$ by its cost or energy $\epsilon_{ij}$,
indicating (for example) the relative programming and
computational cost of execution of this particular choice.

Fig. 1 Generic combinatorial representation of the climate
model weighting framework, showing a single model composed
of individual discrete choices of the $i(j)$th algorithm or
methodology for each model component $j = 1, \ldots, J$.

Fig. 2 Combinatorial representations of (a) the microcanonical
framework, showing a single model composed of $g_{ij}$ degenerate
choices for each model component $j = 1, \ldots, J$ (ranked by
energy level $\epsilon_{ij}$); and (b) the canonical framework, composed
of an ensemble of $N$ amalgamated microcanonical models. Ball
numbers denote the model index.



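Each energy level is considered to have the degeneracy
$g_{ij} \ge 1$, equal to the number of choices which
share the same cost $\epsilon_{ij}$. The ranking scheme therefore
accounts for, but does not distinguish between,
choices of equal cost. Each level is taken to have
the occupancy $m_{ij} \in \{0, 1\}$ (the choices are unique).
From simple probabilistic considerations [19, 20, 21], for
equiprobable degenerate choices, the probability of a
given choice is given by the reduced multinomial

$$\mathbb{P}_j^{(\mu)} = \mathrm{Prob}(\mathbf{m}_j \,|\, \mathbf{g}_j, \theta) = \frac{1}{G_j} \prod_{i=1}^{I(j)} \frac{g_{ij}^{m_{ij}}}{m_{ij}!} \tag{4}$$

where $\mathbf{m}_j = \{m_{1j}, \ldots, m_{I(j)j}\}$, $\mathbf{g}_j = \{g_{1j}, \ldots, g_{I(j)j}\}$,
$G_j = \sum_{i=1}^{I(j)} g_{ij}$ and superscript $\mu$ denotes the
microcanonical framework. Eq. (4) reduces to $\mathbb{P}_j^{(\mu)} = g_{rj}/G_j$,
where $r(j)$ is the selected choice, but it is preferable
to keep the $m_{ij}$ explicit using (4). The probability of
selecting a single overall model, assuming that the $J$
components are independent, is therefore given by the
"multi-multinomial":

$$\mathbb{P}^{(\mu)} = \mathrm{Prob}(\mathbf{m} \,|\, \mathbf{g}, \theta) = \prod_{j=1}^{J} \mathrm{Prob}(\mathbf{m}_j \,|\, \mathbf{g}_j, \theta) = \prod_{j=1}^{J} \frac{1}{G_j} \prod_{i=1}^{I(j)} \frac{g_{ij}^{m_{ij}}}{m_{ij}!} \tag{5}$$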



where $\mathbf{m}$ and $\mathbf{g}$ are the respective matrices of $m_{ij}$ and
$g_{ij}$. Each model is subject to $J$ constraints on the total
occupancy within each component $j$:

$$\sum_{i=1}^{I(j)} m_{ij} = 1, \quad \forall j \tag{6}$$

Assuming that the costs are additive over the $J$ components,
we can also include a constraint on the total
cost $E$ of running the overall model:

$$\sum_{j=1}^{J} \sum_{i=1}^{I(j)} m_{ij}\, \epsilon_{ij} = E \tag{7}$$



To determine the most probable or equilibrium model,
given the above occupancy and total energy constraints,
we should maximise (5) with respect to the unknowns
$\{m_{ij}\}$, subject to (6)-(7). From the Boltzmann definition
(1) with $K = 1$, this is equivalent to maximising
the entropy:

$$\mathfrak{H}^{(\mu)} = \ln \mathbb{P}^{(\mu)} = \sum_{j=1}^{J} \Big\{ -\ln G_j + \sum_{i=1}^{I(j)} \big( m_{ij} \ln g_{ij} - \ln m_{ij}! \big) \Big\} \tag{8}$$

subject to the same constraints. We again emphasise
that (8) is defined on the space of climate models, and
has no connection to the thermodynamic entropy $S$.
If one adopts the Stirling [sif approximation $\ln m_{ij}! \approx m_{ij} \ln m_{ij} - m_{ij}$
for large $m_{ij}$ (in fact this is not strictly
valid), (8) reduces to:

$$\mathfrak{H}^{(\mu)} \approx \sum_{j=1}^{J} \Big\{ -\ln G_j - \sum_{i=1}^{I(j)} m_{ij} \ln \frac{m_{ij}}{g_{ij}} \Big\} \tag{9}$$
Extremisation of (9) subject to (6)-(7) yields the
(microcanonical) Boltzmann distribution at equilibrium:

$$m_{ij}^{*} = g_{ij}\, e^{-(\lambda_{0j}+1) - \lambda_E \epsilon_{ij}} = \frac{g_{ij}\, e^{-\lambda_E \epsilon_{ij}}}{Z_j^{(\mu)}} \tag{10}$$

where $*$ denotes the asymptotic (Stirling-approximate)
extremum, $\lambda_{0j}$ and $\lambda_E$ are Lagrangian multipliers
respectively for the allocation (6) and total energy (7)
constraints, and $Z_j^{(\mu)} = e^{\lambda_{0j}+1} = \sum_{i=1}^{I(j)} g_{ij}\, e^{-\lambda_E \epsilon_{ij}}$ is
the $j$th microcanonical partition function. Eq. (10) can
be solved in conjunction with (6)-(7) to calculate the
predicted occupancies $m_{ij}^{*}$. If the occupancies are restricted
to discrete values $\{0, 1\}$, this will yield the choices
of the optimal climate model, subject to the total
energy constraint $E$. In practice, numerical solution will
typically give floating-point values of $m_{ij}^{*}$, which can be
used as weighting factors with which to combine multiple
models of the same total energy $E$.
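
A minimal Python sketch of this numerical solution is given below. The degeneracies `g`, costs `eps` and target energy are assumed values for illustration (they are not the values of the worked example), and bisection is simply one convenient root-finding choice; it relies on the total energy decreasing monotonically as the multiplier increases.

```python
import math

# Assumed degeneracies g[j][i] and costs eps[j][i] for J = 3 components
# (illustrative values only; not those of the paper's example).
g = [[1, 2, 1], [1, 2, 2, 1], [1, 2, 1]]
eps = [[2.0, 5.0, 9.0], [3.0, 6.0, 8.0, 12.0], [2.0, 5.0, 9.0]]

def boltzmann_weights(lam_e):
    """Eq. (10): m*_ij = g_ij exp(-lam_e eps_ij) / Z_j for every component j."""
    weights = []
    for g_j, eps_j in zip(g, eps):
        terms = [gi * math.exp(-lam_e * ei) for gi, ei in zip(g_j, eps_j)]
        z_j = sum(terms)  # j-th microcanonical partition function
        weights.append([t / z_j for t in terms])
    return weights

def total_energy(lam_e):
    """Left-hand side of the cost constraint, Eq. (7), at multiplier lam_e."""
    m = boltzmann_weights(lam_e)
    return sum(mi * ei for m_j, eps_j in zip(m, eps) for mi, ei in zip(m_j, eps_j))

def solve_multiplier(e_target, lo=-5.0, hi=5.0, tol=1e-12):
    """Bisection on lam_e; total_energy decreases monotonically with lam_e."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_energy(mid) > e_target:
            lo = mid  # energy too high, so a larger multiplier is needed
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

E_TARGET = 15.0  # assumed total cost, between the cheapest (7) and dearest (30) model
lam_e = solve_multiplier(E_TARGET)
m_star = boltzmann_weights(lam_e)
for column in m_star:
    print([round(w, 4) for w in column])
```

Each inner list of `m_star` holds the weights for one component and sums to unity by construction, so only the energy constraint has to be solved for.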

As noted, since $m_{ij} \in \{0, 1\}$, Stirling's approximation
does not strictly apply to the above analysis, and
so (10) is only an approximate solution. This can be
addressed by directly maximising the non-asymptotic
entropy (8) with respect to $m_{ij}$, subject to (6)-(7),
giving the equilibrium distribution [13, 14, 15, 16]:

$$m_{ij}^{\#} = \Lambda^{-1}\big( \ln g_{ij} - \lambda_{0j} - \lambda_E \epsilon_{ij} \big) \tag{11}$$

where $\#$ denotes the non-asymptotic extremum and
$\Lambda^{-1}(y) = \psi^{-1}(y) - 1$ is the upper inverse of the function
$\Lambda(x) = \psi(x+1)$, defined for convenience, in which $\psi(x)$
is the digamma function. (Note (11) can be written with
additional terms in $G_j$ and $N$ [c.f. 16]; these are here
incorporated in $\lambda_{0j}$.) In this case, no explicit partition
functions exist, and (11) must be solved in conjunction
with (6)-(7). This method will give more precise values
of the optimal weighting factors $m_{ij}^{\#}$, although in practice,
its numerical solution can be difficult. The
non-asymptotic solution (11) is itself an approximation to
the true discrete MaxProb solution (with $m_{ij} \in \{0, 1\}$),
which must be identified by a (computationally expensive)
combinatorial search scheme.
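
For very small model spaces, the combinatorial search can be written directly. The Python sketch below is a brute-force illustration under assumed degeneracies and integer costs (hypothetical values, chosen so that the energy constraint can be tested by exact equality): it enumerates every discrete model, retains those satisfying (7), and selects the one of highest probability (5).

```python
from itertools import product

# Assumed degeneracies and integer costs per component (illustrative only).
g = [[1, 2, 1], [1, 2, 2, 1], [1, 2, 1]]
eps = [[2, 5, 9], [3, 6, 8, 12], [2, 5, 9]]
E_TARGET = 16  # assumed total cost

def model_probability(choice):
    """Eq. (5) for a discrete model: product over components of g_rj / G_j."""
    prob = 1.0
    for j, r in enumerate(choice):
        prob *= g[j][r] / sum(g[j])
    return prob

# Keep only the models whose total cost meets the constraint (7) exactly.
feasible = [
    choice
    for choice in product(*(range(len(g_j)) for g_j in g))
    if sum(eps[j][r] for j, r in enumerate(choice)) == E_TARGET
]
best = max(feasible, key=model_probability)
print(feasible, best, model_probability(best))
```

The cost of this search grows with the product of the numbers of choices, which is why it is only practical for small examples such as this one.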

Example: The above framework can be demonstrated
by a simple example, in which a climate model is
constructed from $J = 3$ components, with $I = [3, 4, 3]$
choices of algorithm. The degeneracies and energy levels
are taken as:

In this framework, more (degenerate) algorithms, and
algorithms with a fourth energy level, are available for
model component 2. For a total energy per model of
$E = 17$ units, the inferred asymptotic (10) and
non-asymptotic (11) solutions are, respectively:

$$\mathbf{m}^{*} = \begin{bmatrix} 0.2823 & 0.2250 & 0.2823 \\ 0.3650 & 0.2909 & 0.3650 \\ 0.3527 & 0.2811 & 0.3527 \\ - & 0.2031 & - \end{bmatrix}, \quad \boldsymbol{\lambda}_0^{*} = [0.1192, 1.0393, 0.1192]^{\top}, \quad \lambda_E^{*} = 0.1455 \tag{13}$$

and

$$\mathbf{m}^{\#} = \begin{bmatrix} 0.1950 & 0.1695 & 0.1950 \\ 0.4206 & 0.3883 & 0.4206 \\ 0.3844 & 0.3532 & 0.3844 \\ - & 0.0890 & - \end{bmatrix}, \quad \boldsymbol{\lambda}_0^{\#} = [0.1494, 0.8755, 0.1494]^{\top}, \quad \lambda_E^{\#} = 0.1460 \tag{14}$$

All calculations were conducted in Maple 14. The
equilibrium model should thus be constructed using the
weights in $\mathbf{m}^{*}$ (or, arguably, $\mathbf{m}^{\#}$). In this example,
algorithms of intermediate cost (the second energy level)
have the highest weighting. Some difference is evident
between the asymptotic and non-asymptotic solutions
$\mathbf{m}$ and Massieu functions $\boldsymbol{\lambda}_0$, due to the small model
space of this simplified example. The energy multipliers
$\lambda_E$ of the two solutions are, however, quite similar.



2.2 "Canonical" Framework I 

The foregoing methodology is mathematically sound
and provides a formal framework for the combination
of different climate models. It is, however, somewhat
restrictive in that it only includes models of a single
total energy $E$. It is possible to conduct the analysis
at a higher "canonical" level - in the same manner as
in thermodynamics - by the analysis of "systems of
systems", in this case involving ensembles of individual
climate models. This is shown in Figure 2(b), in
which an ensemble is constructed by collecting a sample
(without replacement) of $N$ individual models, and
amalgamating the results. This can be represented by
a combinatorial scheme in which distinguishable balls
- labelled by the model index $z \in \{1, \ldots, N\}$ - are
allocated to distinguishable levels $i(j)$, again indicating
choices of energy level $\epsilon_{ij}$ with degeneracy $g_{ij}$. This
gives the occupancies $n_{ij} \in \{0, \ldots, N\}$ for each energy level
of the ensemble, which are connected to those for each
model by:

$$n_{ij} = \sum_{z=1}^{N} m_{ij}^{(z)} \tag{15}$$

$$\sum_{i=1}^{I(j)} n_{ij} = \sum_{i=1}^{I(j)} \sum_{z=1}^{N} m_{ij}^{(z)} = N, \quad \forall j \tag{16}$$



The probability of a specified set of occupancies $n_{ij}$ for
a particular $j$ is now given by [19, 20, 21]:

$$\mathbb{P}_j^{(\chi)} = \mathrm{Prob}(\mathbf{n}_j \,|\, \mathbf{g}_j, N, \theta) = \frac{N!}{G_j^{N}} \prod_{i=1}^{I(j)} \frac{g_{ij}^{n_{ij}}}{n_{ij}!} \tag{17}$$

where $\mathbf{n}_j = \{n_{1j}, \ldots, n_{I(j)j}\}$, while $\chi$ denotes the
canonical framework. The multinomial factor $N! / \prod_{i=1}^{I(j)} n_{ij}!$
accounts for the number of permutations of models which
attain the same set of occupancies $\mathbf{n}_j$. The probability
of a specified ensemble, again assuming $J$ independent
components, is thus given by the "multi-multinomial":

$$\mathbb{P}^{(\chi)} = \mathrm{Prob}(\mathbf{n} \,|\, \mathbf{g}, N, \theta) = \prod_{j=1}^{J} \mathrm{Prob}(\mathbf{n}_j \,|\, \mathbf{g}_j, N, \theta) = \prod_{j=1}^{J} \frac{N!}{G_j^{N}} \prod_{i=1}^{I(j)} \frac{g_{ij}^{n_{ij}}}{n_{ij}!} \tag{18}$$

where $\mathbf{n}$ is the matrix of $n_{ij}$, whence (1) with $K = N^{-1}$
gives the entropy:

$$\mathfrak{H}^{(\chi)} = \frac{1}{N} \sum_{j=1}^{J} \Big\{ \ln N! - N \ln G_j + \sum_{i=1}^{I(j)} \big( n_{ij} \ln g_{ij} - \ln n_{ij}! \big) \Big\} \tag{19}$$

This is subject to constraints on the occupancies, given
by the first part of (16).

How should we analyse ensembles of models? We
could, in the first instance, examine the set of all possible
models, of cardinality $\prod_{j=1}^{J} G_j$. This would
not, however, be very informative, since all models would
a priori be of equal weight and so would not be discriminated
by the MaxProb (or MaxEnt) method. The total
ensemble also does not allow the inclusion of additional
information about the desired set of models. If, on the
other hand, we impose a constraint on the mean energy
of the ensemble:

$$\frac{1}{N} \sum_{j=1}^{J} \sum_{i=1}^{I(j)} n_{ij}\, \epsilon_{ij} = \langle E \rangle \tag{20}$$

we then impose a decision rule on its desired composition,
namely, on the average cost of constructing its
constituent models. In contrast to the microcanonical
framework, this allows models of greater-than-average
total energy $E > \langle E \rangle$, so long as these are balanced in
the ensemble by models of lower energy $E < \langle E \rangle$.
Combining (19) with (16) and (20) gives the Lagrangian:

$$L^{(\chi)} = \sum_{j=1}^{J} \Big\{ \frac{1}{N} \ln N! - \ln G_j + \sum_{i=1}^{I(j)} \Big( \frac{n_{ij}}{N} \ln g_{ij} - \frac{1}{N} \ln n_{ij}! \Big) \Big\} - \sum_{j=1}^{J} \kappa_{0j} \Big( \sum_{i=1}^{I(j)} \frac{n_{ij}}{N} - 1 \Big) - \kappa_E \Big( \sum_{j=1}^{J} \sum_{i=1}^{I(j)} \frac{n_{ij}}{N}\, \epsilon_{ij} - \langle E \rangle \Big) \tag{21}$$

where $\kappa_{0j}$ and $\kappa_E$ are Lagrangian multipliers for the
allocation (16) and energy (20) constraints. Extremisation
gives the non-asymptotic equilibrium solution:

$$n_{ij}^{\#} = \Lambda^{-1}\big( \ln g_{ij} - \kappa_{0j} - \kappa_E \epsilon_{ij} \big), \quad \forall j \tag{22}$$

(again all constant terms are brought into $\kappa_{0j}$). For any
given $N$, these can be solved numerically in conjunction
with the constraints (16) and (20), to give the optimum
number of times (weighting factor) $n_{ij}^{\#}$ that each choice
$i(j)$ should be included in the ensemble, subject to $\langle E \rangle$.

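When the factorials in (19) satisfy Stirling's approximation,
(22) gives the (canonical) Boltzmann distribution
at equilibrium:

$$\frac{n_{ij}^{*}}{N} = \frac{g_{ij}\, e^{-\kappa_E \epsilon_{ij}}}{Z_j^{(\chi)}} \tag{23}$$

where $Z_j^{(\chi)} = N e^{\kappa_{0j}+1} = \sum_{i=1}^{I(j)} g_{ij}\, e^{-\kappa_E \epsilon_{ij}}$ is the $j$th
canonical partition function.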

Example: The canonical framework can be demonstrated
using the example described previously (12),
now constrained by a mean total energy per model of
$\langle E \rangle = 17$ units (less than the mean of all possible models,
$\langle E \rangle = 24.3904$ units). The inferred asymptotic
solution (23) is identical to (13), i.e.:

$$\frac{\mathbf{n}^{*}}{N} = \mathbf{m}^{*}, \quad \boldsymbol{\kappa}_0^{*} = \begin{bmatrix} \ln(3.062388620\, N^{-1}) - 1 \\ \ln(7.685321125\, N^{-1}) - 1 \\ \ln(3.062388620\, N^{-1}) - 1 \end{bmatrix}, \quad \kappa_E^{*} = \lambda_E^{*} \tag{24}$$

For $N = 27$ (say) this gives $\boldsymbol{\kappa}_0^{*} = [-3.1766, -2.2565, -3.1766]^{\top}$.
In comparison, the non-asymptotic solution
(22) at $N = 27$ is:

$$\frac{\mathbf{n}^{\#}}{N} = \begin{bmatrix} 0.2795 & 0.2232 & 0.2795 \\ 0.3668 & 0.2940 & 0.3668 \\ 0.3537 & 0.2834 & 0.3537 \\ - & 0.1994 & - \end{bmatrix}, \quad \boldsymbol{\kappa}_0^{\#} = [-2.2315, -1.3292, -2.2315]^{\top}, \quad \kappa_E^{\#} = 0.1455 \tag{25}$$

Compared to (14), the latter exhibits a more uniform
distribution for each $j$, and is closer to the asymptotic
form.
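
In practice an ensemble contains whole numbers of models, so fractional weights such as those in (25) must be converted to integer counts. The Python sketch below applies a largest-remainder rounding rule to the first column of (25) with $N = 27$; the rounding rule and the function name are assumptions introduced here for illustration, not part of the framework itself.

```python
import math

def integer_counts(weights, n_models):
    """Round fractional weights to integer counts summing to n_models,
    using the largest-remainder rule (an assumed rounding convention)."""
    raw = [w * n_models for w in weights]
    counts = [math.floor(x) for x in raw]
    shortfall = n_models - sum(counts)
    # Assign the remaining models to the entries with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts

# First column of the non-asymptotic weights n#/N in Eq. (25), with N = 27.
weights_component_1 = [0.2795, 0.3668, 0.3537]
counts = integer_counts(weights_component_1, 27)
print(counts)  # -> [7, 10, 10]
```

Other rounding conventions are possible; whichever is used, the counts for each component should still sum to $N$ so that the allocation constraint remains satisfied.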



2.3 "Canonical" Framework II 

One difficulty with the above canonical framework is
that it - like its microcanonical precursor - still requires
separability of the model into $J$ distinct components,
for which the costs $\epsilon_{ij}$ are additive. In more general
situations, this separability may not be possible due
to coupling between components. In that case we must
revert to a model space based on ensembles of entire
models. Severing all connection to the components $j$, we
consider a model space from which we collect a sample
(ensemble) of $\mathcal{N}$ models, containing $n_i$ models each of
total energy $E_i$. Each energy level has degeneracy $g_i$.
The probability of a specified ensemble is:

$$\mathbb{P}_{II}^{(\chi)} = \mathrm{Prob}(\mathbf{n} \,|\, \mathbf{g}, \mathcal{N}, \theta) = \frac{\mathcal{N}!}{G^{\mathcal{N}}} \prod_{i=1}^{I} \frac{g_i^{n_i}}{n_i!} \tag{26}$$

where $G = \sum_{i=1}^{I} g_i$. Boltzmann's equation (1) with
$K = \mathcal{N}^{-1}$ gives the entropy:

$$\mathfrak{H}_{II}^{(\chi)} = \frac{1}{\mathcal{N}} \Big\{ \ln \mathcal{N}! - \mathcal{N} \ln G + \sum_{i=1}^{I} \big( n_i \ln g_i - \ln n_i! \big) \Big\} \tag{27}$$

This is subject to the occupancy and mean ensemble
energy constraints:

$$\frac{1}{\mathcal{N}} \sum_{i=1}^{I} n_i = 1 \tag{28}$$

$$\frac{1}{\mathcal{N}} \sum_{i=1}^{I} E_i\, n_i = \langle E \rangle \tag{29}$$

Forming the Lagrangian and extremisation gives the
non-asymptotic equilibrium occupancies:

$$n_i^{\#} = \Lambda^{-1}\big( \ln g_i - \varphi_0 - \varphi_E E_i \big) \tag{30}$$

where $\varphi_0$ and $\varphi_E$ are Lagrangian multipliers for the
occupancy and energy constraints. If (27) satisfies the
Stirling approximation, the distribution reduces to:

$$\frac{n_i^{*}}{\mathcal{N}} = \frac{g_i\, e^{-\varphi_E E_i}}{Z_{II}^{(\chi)}} \tag{31}$$

where $Z_{II}^{(\chi)} = \mathcal{N} e^{\varphi_0 + 1} = \sum_{i=1}^{I} g_i\, e^{-\varphi_E E_i}$ is the
canonical II partition function. Either (30) or (if valid) (31)
can be solved in conjunction with the constraints (28)-(29),
to give the weights of the most representative
model.
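
The asymptotic form (31) with the mean energy constraint (29) is a one-dimensional root-finding problem. The Python sketch below solves it by Newton iteration, using the standard result that the derivative of the mean energy with respect to the multiplier is minus the energy variance. The energy levels, degeneracies and target mean are assumed values for illustration only.

```python
import math

# Assumed whole-model energy levels E_i and degeneracies g_i (illustrative only).
energies = [10.0, 14.0, 18.0, 22.0, 30.0]
degeneracies = [1, 3, 4, 3, 1]

def ensemble_stats(phi_e):
    """Weights n_i*/N from Eq. (31), with their mean energy and energy variance."""
    terms = [g * math.exp(-phi_e * e) for g, e in zip(degeneracies, energies)]
    z = sum(terms)  # canonical II partition function
    weights = [t / z for t in terms]
    mean = sum(w * e for w, e in zip(weights, energies))
    var = sum(w * (e - mean) ** 2 for w, e in zip(weights, energies))
    return weights, mean, var

def solve_phi_e(mean_target, phi_e=0.0, tol=1e-12, max_iter=100):
    """Newton iteration on <E>(phi_e) = mean_target, with d<E>/dphi_e = -Var(E)."""
    for _ in range(max_iter):
        _, mean, var = ensemble_stats(phi_e)
        step = (mean - mean_target) / var
        phi_e += step
        if abs(step) < tol:
            break
    return phi_e

MEAN_TARGET = 16.0  # assumed mean ensemble cost, below the unconstrained mean
phi_e = solve_phi_e(MEAN_TARGET)
weights, mean, var = ensemble_stats(phi_e)
print(phi_e, [round(w, 4) for w in weights], mean)
```

A target below the unconstrained mean gives a positive multiplier, shifting weight toward cheaper models, in the same way that a lower temperature shifts a thermodynamic system toward lower energy levels.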



2.4 Summary 

At this point, it is worth summarising some important 
features of the microcanonical and two canonical frame- 
works proposed: 

• As evident from the predicted solutions (10)-(11),
(22)-(23) and (30)-(31), if one seeks the optimal model to describe
a set of climate models, it is not necessary to compute
all possible combinations of models. Using the
MaxProb method, one can directly calculate the single
model or a reduced set of models which best represents
the model set, subject to constraint(s) on the
model or ensemble properties. The effect of two competing
constraints is examined in the next section.

• The microcanonical framework imposes constraint(s)
on individual models, whereas the two canonical frameworks
impose constraint(s) over ensembles of models.
The latter enable the synthesis of larger sets of models.

• Note that, due to the assumed independence of the $J$
model components, the microcanonical and canonical
I frameworks are "multi-multinomial" (5) and (18).
The choices for a specified $j = d$ are thus independent
of the other choices $j \neq d$. The MaxProb
prediction can therefore be computed using individual
models composed of whichever choices are
convenient, so long as the overall set conforms to the
MaxProb prediction. In the canonical II model, we
overcome the difficulty of coupled model components
by considering ensembles of entire models, with constraints
on the total energy of each model.

• How should we interpret the Lagrangian multipliers
on the energy constraint? By analogy with thermodynamics,
these can be interpreted as $\lambda_E = 1/kT^{(\mu)}$,
$\kappa_E = 1/kT^{(\chi)}$ and $\varphi_E = 1/kT_{II}^{(\chi)}$, where the $T$
parameters are framework "temperatures" and $k$ is a
constant with units of energy (or cost) per temperature
unit. The $T$'s are not thermodynamic temperatures,
but express the distribution of energy over the
available energy levels, in the relevant model or ensemble
space. In effect, they serve as proxy variables
for the total model cost $E$ or mean ensemble cost $\langle E \rangle$.

• Although the MaxProb framework is primarily designed
to determine the most probable (maximum entropy)
model, it is also possible to interrogate the Lagrangian
to determine the minimum entropy model(s),
i.e. those which lie farthest from the optimum. In
this manner, one can explore the extremities of the
model or ensemble space, to identify model outliers.
Since minimum entropy solutions tend to lie on
non-continuous boundaries of the solution domain, they
are generally inaccessible to extremisation methods
[34]; nonetheless, they should be identifiable by numerical
optimisation algorithms such as simulated annealing.

• The mathematical structure of the output from the 
MaxEnt algorithm gives rise to many more features 
of the predicted solution. Some of these features are 
explored in later sections. 



3 "Canonical" Framework II with Cost and 
Benefit Constraints 

3.1 Derivation 

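We now examine a more comprehensive canonical II
framework, in which we impose two constraints at the
ensemble level: a constraint on the mean ensemble cost
or energy $\langle E \rangle$, as before (29), and also a constraint on
some measure of the average ensemble "worthiness" or
"benefit" $\langle B \rangle$ (for example, a measure of its precision
or accuracy). In this manner, we construct a MaxProb
framework with which to conduct cost-benefit analyses
of various ensembles of models, and to interrogate the
trade-off between costs and benefits^1. In general, the
energy and benefit levels will have different ranks,
necessitating the use of different indices $i \in \{1, \ldots, I\}$ (as
before) and $\ell \in \{1, \ldots, L\}$. We therefore consider model
choices ranked by total model energies $E_{i\ell}$ and benefits
$B_{i\ell}$, of joint degeneracy $g_{i\ell}$. The probability that
an ensemble of $\mathcal{N}$ models has the occupancies $\{n_{i\ell}\}$ is
governed by the multinomial:

$$\mathbb{P} = \mathrm{Prob}(\mathbf{n} \,|\, \mathbf{g}, \mathcal{N}, \theta) = \frac{\mathcal{N}!}{G^{\mathcal{N}}} \prod_{i=1}^{I} \prod_{\ell=1}^{L} \frac{g_{i\ell}^{n_{i\ell}}}{n_{i\ell}!} \tag{32}$$

where now $G = \sum_{i=1}^{I} \sum_{\ell=1}^{L} g_{i\ell}$ (for convenience we drop
the super- and subscript labels). From (1) with $K = \mathcal{N}^{-1}$,
we maximise the entropy:

$$\mathfrak{H} = \frac{1}{\mathcal{N}} \Big\{ \ln \mathcal{N}! - \mathcal{N} \ln G + \sum_{i=1}^{I} \sum_{\ell=1}^{L} \big( n_{i\ell} \ln g_{i\ell} - \ln n_{i\ell}! \big) \Big\} \tag{33}$$

subject to the constraints:

$$\frac{1}{\mathcal{N}} \sum_{i=1}^{I} \sum_{\ell=1}^{L} n_{i\ell} = 1 \tag{34}$$

$$\frac{1}{\mathcal{N}} \sum_{i=1}^{I} \sum_{\ell=1}^{L} n_{i\ell}\, E_{i\ell} = \langle E \rangle \tag{35}$$

$$\frac{1}{\mathcal{N}} \sum_{i=1}^{I} \sum_{\ell=1}^{L} n_{i\ell}\, B_{i\ell} = \langle B \rangle \tag{36}$$

to give the non-asymptotic equilibrium solution:

$$n_{i\ell}^{\#} = \Lambda^{-1}\big( \ln g_{i\ell} - \omega_0 - \omega_E E_{i\ell} - \omega_B B_{i\ell} \big) \tag{37}$$

where $\omega_0$, $\omega_E$ and $\omega_B$ are the Lagrangian multipliers.
If Stirling's approximation applies, the entropy is:

$$\mathfrak{H} \approx -\ln G - \sum_{i=1}^{I} \sum_{\ell=1}^{L} \frac{n_{i\ell}}{\mathcal{N}} \ln \frac{n_{i\ell}}{\mathcal{N} g_{i\ell}} \tag{38}$$

whence extremisation gives:

$$\frac{n_{i\ell}^{*}}{\mathcal{N}} = \frac{g_{i\ell}\, e^{-\omega_E E_{i\ell} - \omega_B B_{i\ell}}}{Z}, \qquad Z = \sum_{i=1}^{I} \sum_{\ell=1}^{L} g_{i\ell}\, e^{-\omega_E E_{i\ell} - \omega_B B_{i\ell}} \tag{39}$$

where $Z$ is the partition function.

^1 This approach is applicable not only to climate models, but
models of any type, including economic models.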

The Lagrangian multiplier $\omega_E$ can again be interpreted
as an inverse ensemble temperature $\omega_E = 1/kT$,
where $k$ is a constant. The multiplier $\omega_B$ can be considered
as a measure of the overall benefit provided by the
ensemble, in reciprocal benefit units. In effect, it acts
as a proxy variable for the mean benefit $\langle B \rangle$. Since $\langle B \rangle$
measures the average information or value provided by
the framework, it can be interpreted very crudely as
a reciprocal density or volume, whereupon we can interpret
$\omega_B = P/kT$, in which $P$ is a mean ensemble
pressure (this interpretation should not be taken too
seriously).
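
The two-constraint solution (39) can be obtained numerically by solving for $\omega_E$ and $\omega_B$ simultaneously. The Python sketch below is one possible approach under assumed data: it uses a damped Newton iteration in which the Jacobian of the means with respect to the multipliers is minus the covariance matrix of $E$ and $B$. The joint levels, degeneracies and targets are hypothetical, and the backtracking safeguard is an implementation choice rather than part of the framework.

```python
import math

# Assumed joint levels: each tuple is (energy E, benefit B, degeneracy g).
levels = [
    (10.0, 1.0, 1), (10.0, 2.0, 1),
    (15.0, 2.0, 3), (15.0, 4.0, 2),
    (20.0, 3.0, 2), (20.0, 6.0, 3),
    (30.0, 5.0, 1), (30.0, 9.0, 1),
]

def moments(w_e, w_b):
    """Weights from Eq. (39), the partition function, and moments of E and B."""
    terms = [g * math.exp(-w_e * e - w_b * b) for e, b, g in levels]
    z = sum(terms)
    p = [t / z for t in terms]
    mean_e = sum(pi * lv[0] for pi, lv in zip(p, levels))
    mean_b = sum(pi * lv[1] for pi, lv in zip(p, levels))
    var_e = sum(pi * (lv[0] - mean_e) ** 2 for pi, lv in zip(p, levels))
    var_b = sum(pi * (lv[1] - mean_b) ** 2 for pi, lv in zip(p, levels))
    cov = sum(pi * (lv[0] - mean_e) * (lv[1] - mean_b) for pi, lv in zip(p, levels))
    return p, z, mean_e, mean_b, var_e, var_b, cov

def solve_multipliers(target_e, target_b, tol=1e-10, max_iter=100):
    """Damped Newton iteration for the multipliers of the two constraints."""
    def dual(w_e, w_b):
        # Convex function ln Z + w.<targets>, minimised where both constraints hold.
        return math.log(moments(w_e, w_b)[1]) + w_e * target_e + w_b * target_b

    w_e = w_b = 0.0
    for _ in range(max_iter):
        _, _, mean_e, mean_b, var_e, var_b, cov = moments(w_e, w_b)
        r_e, r_b = mean_e - target_e, mean_b - target_b
        if max(abs(r_e), abs(r_b)) < tol:
            break
        det = var_e * var_b - cov * cov
        # Newton direction: covariance matrix inverse applied to the residuals.
        d_e = (var_b * r_e - cov * r_b) / det
        d_b = (var_e * r_b - cov * r_e) / det
        step = 1.0
        while (dual(w_e + step * d_e, w_b + step * d_b) > dual(w_e, w_b) + 1e-12
               and step > 1e-8):
            step *= 0.5  # backtrack if the full step overshoots
        w_e, w_b = w_e + step * d_e, w_b + step * d_b
    return w_e, w_b

TARGET_E, TARGET_B = 17.5, 4.0  # assumed mean cost and mean benefit
w_e, w_b = solve_multipliers(TARGET_E, TARGET_B)
p, z, mean_e, mean_b, var_e, var_b, cov = moments(w_e, w_b)
print(w_e, w_b, mean_e, mean_b)
```

The targets must lie inside the range of mean costs and benefits attainable from the available levels; otherwise no finite pair of multipliers satisfies both constraints.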



3.2 Jaynesian Mathematical Structure 

Now that we have the main results, we can examine 
several important mathematical features of the solu- 
tion. Most of these were reported in a generic context 
by Jaynes 0, H, @] (see also Kapur & Kesavan [34]
and Tribus [35]), although many were previously known
in thermodynamics. The foregoing microcanonical and 
canonical I and II frameworks also exhibit these fea- 
tures, but it is more interesting to examine the effect of 
two competing constraints. 

Firstly, for the Stirling-approximate case, substitution
of (39) into (38), by sorting into expectations along
the lines of (35)-(36), gives the asymptotic maximum entropy:

$$\mathfrak{H}^{*} = -\ln G - \phi + \omega_E \langle E \rangle + \omega_B \langle B \rangle \tag{40}$$



where for convenience we define the potential function
(negative Massieu function) $\phi = -\omega_0 = -\ln Z$. The
most probable state of the ensemble is thus given by
a constant term, plus the Massieu function, plus the
sum of products of the constraints and their conjugate
Lagrangian multipliers.

Since the entropy function and constraints are state
variables on the space of ensembles of models, (40) provides
a linear homogeneous equation which describes the
framework^2. This can be used to examine the response
of the framework to changes in the constraints and/or
multipliers. For constant $G$ and $\phi$ we immediately see
that 0,0,11): 



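$$\frac{\partial \mathfrak{H}^{*}}{\partial \langle E \rangle} = \omega_E, \qquad \frac{\partial \mathfrak{H}^{*}}{\partial \langle B \rangle} = \omega_B \tag{41}$$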



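Second differentiation gives the Hessian matrix:

$$-\mathbf{a} = \begin{bmatrix} \dfrac{\partial^2 \mathfrak{H}^{*}}{\partial \langle E \rangle^2} & \dfrac{\partial^2 \mathfrak{H}^{*}}{\partial \langle E \rangle \partial \langle B \rangle} \\[2ex] \dfrac{\partial^2 \mathfrak{H}^{*}}{\partial \langle B \rangle \partial \langle E \rangle} & \dfrac{\partial^2 \mathfrak{H}^{*}}{\partial \langle B \rangle^2} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial \omega_E}{\partial \langle E \rangle} & \dfrac{\partial \omega_B}{\partial \langle E \rangle} \\[2ex] \dfrac{\partial \omega_E}{\partial \langle B \rangle} & \dfrac{\partial \omega_B}{\partial \langle B \rangle} \end{bmatrix} \tag{42}$$

If the mixed derivatives are equivalent (i.e. $\mathfrak{H}^{*}$ is
continuous and continuously differentiable, at least up to
second order), this gives the reciprocal or Maxwell-like
relation:

$$\frac{\partial \omega_E}{\partial \langle B \rangle} = \frac{\partial \omega_B}{\partial \langle E \rangle} \tag{43}$$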



Equivalently, (40) can be rewritten as a function of the
potential $\phi$, whence it can be shown that 0, B @|:

$$\frac{\partial \phi}{\partial \omega_E} = \langle E \rangle, \qquad \frac{\partial \phi}{\partial \omega_B} = \langle B \rangle \tag{44}$$

^2 Strictly, if the initial terms in $G$ and $\phi$ are constant, the
differential of (40) is a linear homogeneous first-order differential
equation. Absorbing the constant into $\phi$, (40) can then be
interpreted as an Euler equation [c.f. 36].



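Second differentiation gives:

$$-\boldsymbol{\alpha} = \begin{bmatrix} \dfrac{\partial^2 \phi}{\partial \omega_E^2} & \dfrac{\partial^2 \phi}{\partial \omega_E \partial \omega_B} \\[2ex] \dfrac{\partial^2 \phi}{\partial \omega_B \partial \omega_E} & \dfrac{\partial^2 \phi}{\partial \omega_B^2} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial \langle E \rangle}{\partial \omega_E} & \dfrac{\partial \langle B \rangle}{\partial \omega_E} \\[2ex] \dfrac{\partial \langle E \rangle}{\partial \omega_B} & \dfrac{\partial \langle B \rangle}{\partial \omega_B} \end{bmatrix} \tag{45}$$

giving, again for equivalent mixed derivatives:

$$\frac{\partial \langle B \rangle}{\partial \omega_E} = \frac{\partial \langle E \rangle}{\partial \omega_B} \tag{46}$$

From (42) and (45), it is evident that:

$$\mathbf{a} = \boldsymbol{\alpha}^{-1} \tag{47}$$

This defines a Legendre transformation between $\mathfrak{H}^{*}$ and
$\phi$ representations of the system 0, [34].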

Finally, we note that it may be desirable to rank
climate models by more than two properties, e.g. the
model cost $E$ and several different benefits $B_1, B_2, \ldots, B_M$.
The foregoing analysis can readily be extended into as
many dimensions as desired, giving the above mathematical
structure as a function of the constraints $\langle E \rangle$
and $\langle B_1 \rangle, \ldots, \langle B_M \rangle$.
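
The first-derivative relations (44) and the reciprocal relation (46) can be checked by finite differences on any small example. The Python sketch below does so for assumed joint levels and an arbitrary assumed pair of multipliers; all numerical values are illustrative and the function names are hypothetical.

```python
import math

# Assumed joint levels: each tuple is (energy E, benefit B, degeneracy g).
levels = [
    (10.0, 1.0, 1), (10.0, 2.0, 1),
    (15.0, 2.0, 3), (15.0, 4.0, 2),
    (20.0, 3.0, 2), (20.0, 6.0, 3),
    (30.0, 5.0, 1), (30.0, 9.0, 1),
]

def potential(w_e, w_b):
    """Potential phi = -ln Z, with Z the partition function of Eq. (39)."""
    return -math.log(sum(g * math.exp(-w_e * e - w_b * b) for e, b, g in levels))

def means(w_e, w_b):
    """Mean energy <E> and mean benefit <B> under the weights of Eq. (39)."""
    terms = [g * math.exp(-w_e * e - w_b * b) for e, b, g in levels]
    z = sum(terms)
    mean_e = sum(t * lv[0] for t, lv in zip(terms, levels)) / z
    mean_b = sum(t * lv[1] for t, lv in zip(terms, levels)) / z
    return mean_e, mean_b

w_e, w_b, h = 0.05, -0.10, 1e-5  # assumed multipliers and finite-difference step
mean_e, mean_b = means(w_e, w_b)

# Eq. (44): central differences of phi with respect to each multiplier.
dphi_dwe = (potential(w_e + h, w_b) - potential(w_e - h, w_b)) / (2 * h)
dphi_dwb = (potential(w_e, w_b + h) - potential(w_e, w_b - h)) / (2 * h)

# Eq. (46): the two cross-derivatives of the means should coincide.
dmean_b_dwe = (means(w_e + h, w_b)[1] - means(w_e - h, w_b)[1]) / (2 * h)
dmean_e_dwb = (means(w_e, w_b + h)[0] - means(w_e, w_b - h)[0]) / (2 * h)

print(dphi_dwe, mean_e)
print(dphi_dwb, mean_b)
print(dmean_b_dwe, dmean_e_dwb)
```

Agreement to within the finite-difference error indicates that the signs and conjugate pairings in (44) and (46) are being applied consistently in an implementation.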



3.3 Implications 

What are the implications of the above Jaynesian math- 
ematical structure? In essence, it governs the effect of 
changes to the constraints and/or multipliers on the 
manifold of equilibrium positions of the framework. This 
includes: 



• Firstly, the first derivatives (41) and (44) can be
interpreted as equations of state on the space of ensembles
of models, describing the relationship between
the rate of change of the entropy or potential as a
function of the constraints or their conjugate multipliers [37].

• Secondly, the second derivatives (42) and (45) describe
the susceptibilities of the framework, i.e. the
functional connections between the constraints and
multipliers. In thermodynamics, such susceptibilities
include the heat capacity, isothermal compressibility,
coefficient of thermal expansion and so on [e.g.
27, 31, 36, 38]; if desired, such parameters can also
be defined for the model framework proposed here.
The Maxwell-like relations (43) and (46) reflect the
coupling between the constraints, such that changes
in one constraint or its multiplier, at constant $\mathfrak{H}^{*}$ or
$\phi$, will produce adjustments to the other pair.

• Thirdly, the second derivative matrix (45) of the
potential function $\phi$ contains even further information,
since in the asymptotic limit ($N \to \infty$), it is equivalent
(with change of sign) to the variance-covariance
matrix of the constraints 0, @, @, H^l : 



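$$\boldsymbol{\alpha} = \begin{bmatrix} \langle E^2 \rangle - \langle E \rangle^2 & \langle EB \rangle - \langle E \rangle \langle B \rangle \\ \langle EB \rangle - \langle E \rangle \langle B \rangle & \langle B^2 \rangle - \langle B \rangle^2 \end{bmatrix} \tag{48}$$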



Accordingly, $\boldsymbol{\alpha}$ is positive definite (or semi-definite
if singularities exist) 3J]. From the Legendre transformation
(47), $\mathbf{a}$ is also positive definite (or
semi-definite) [34]. In consequence, from (42) and (45)
(including the tensor sign reversals), $\mathfrak{H}^{*}(\langle E \rangle, \langle B \rangle)$ and
$\phi(\omega_E, \omega_B)$ are both concave functions. Furthermore,
the diagonal of $\boldsymbol{\alpha}$ gives the magnitude of the standard
deviation or "fluctuations" of the ensemble with
respect to each constraint, usually expressed in normalised
form by the coefficients of variation [36]:



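$$C_v(E) = \frac{\sqrt{\langle E^2 \rangle - \langle E \rangle^2}}{\langle E \rangle} = \frac{1}{\langle E \rangle} \sqrt{-\frac{\partial \langle E \rangle}{\partial \omega_E}}, \qquad C_v(B) = \frac{\sqrt{\langle B^2 \rangle - \langle B \rangle^2}}{\langle B \rangle} = \frac{1}{\langle B \rangle} \sqrt{-\frac{\partial \langle B \rangle}{\partial \omega_B}} \tag{49}$$

The covariance, similarly normalised, provides a measure
of the coupling between constraints:

$$C_v(E, B) = \frac{\langle EB \rangle - \langle E \rangle \langle B \rangle}{\langle E \rangle \langle B \rangle} = -\frac{1}{\langle E \rangle \langle B \rangle} \frac{\partial \langle B \rangle}{\partial \omega_E} = -\frac{1}{\langle E \rangle \langle B \rangle} \frac{\partial \langle E \rangle}{\partial \omega_B} \tag{50}$$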

• Fourthly, the manifold of predicted equilibrium positions
defined by $\mathfrak{H}^{*}(\langle E \rangle, \langle B \rangle)$ or $\phi(\omega_E, \omega_B)$ can be
interpreted as a framework geometry, analogous to the
thermodynamic geometry examined by Gibbs [39, 40, 41]
(see also 36, 42, H^). For example, if we consider
$\langle B \rangle$ as a function of $\langle E \rangle$, as shown graphically
in Figure 3, we can represent positions of constant entropy
$\mathfrak{H}^{*}$ by a series of isentropic curves on this graph.
From (40), these will be straight lines with negative
gradient $-\omega_E/\omega_B$, indicating that an increase in the
energy or cost $\langle E \rangle$, at constant $\mathfrak{H}^{*}$ (and $\phi$), causes a
corresponding decrease in $\langle B \rangle$. Of course, many other
curves can also be plotted on the diagram, including
isoenergetic, isobenefit, iso-$\omega_E$ and iso-$\omega_B$ curves,
defined by rearrangements of (40). We can also plot
$\omega_B$ as a function of $\omega_E$, on which we can construct
isopotential curves with negative gradient $-\langle E \rangle/\langle B \rangle$.
(Adopting the crude analogy of §3.1, these can be
transformed to plots of $P$ as a function of $T$ for the
model framework.) Three-dimensional graphs such as
$\mathfrak{H}^{*}(\langle E \rangle, \langle B \rangle)$ or $\phi(\omega_E, \omega_B)$ can also be constructed,
containing isosurfaces of various kinds [40, 41]. As
pointed out by Gibbs [39, 40, 41], it is advantageous to
plot "fundamental equations" such as $\mathfrak{H}^{*}(\langle E \rangle, \langle B \rangle)$ or
$\phi(\omega_E, \omega_B)$, rather than forms unobtainable from these
by Legendre transformation (such as $\mathfrak{H}^{*}(\omega_E, \omega_B)$), so
that all parameters not represented on the axes can be
calculated for a given path simply by differentiation.

Recalling that the frameworks herein consist of all 
possible models consistent with the constraints, the 
resulting manifold $\mathfrak{H}^*(\langle E \rangle, \langle B \rangle)$ or $\phi(\omega_E, \omega_B)$ should
for the most part be continuous in its geometric space, 
reflecting infinitesimal changes in parameters and in- 
cremental changes in model algorithms. However, in 
some circumstances there may be discontinuities in 
the manifold, due to abrupt changes in model algo- 
rithm or adoption of different scientific paradigms. 
Such changes can be described as phase changes or 
tipping points within the model space, leading to as- 
sortments of stable and unstable solutions and path- 
dependent hysteresis effects. These may create partic- 
ular difficulties, but can of course be handled in much 
the same manner as in thermodynamics. 

• Finally, it can be shown that either framework
$\mathfrak{H}^*(\langle E \rangle, \langle B \rangle)$ or $\phi(\omega_E, \omega_B)$ can be endowed
with a Riemannian geometry (entirely distinct from the
framework geometry just described), using the metric
furnished by the respective (positive definite) Hessian
matrix $\mathbf{a}_{\mathfrak{H}}$ or $\mathbf{a}_{\phi}$ [22, 23, 24, 25, 26, 27].
As noted, the two metrics
and hence the geometries are connected by Legendre
transformation (47). The Riemannian interpretation
leads to an important physical limit: a least action
bound on the cost, in units of $\mathfrak{H}^*$ or $\phi$, to move the
framework from one equilibrium position to another
at finite rates of change of the constraints or multi-
pliers. This bound - which constitutes an extension
of finite time thermodynamics [12, 13, 14, 15, 16, 17]
but is in some sense allied to the informational limits
identified by Szilard [44], Landauer [45], Bennett [46]
and similar workers [47] - is examined in more detail
in Appendix A.
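
The covariance relations in the third point above can be checked numerically. The following is a minimal sketch rather than the author's implementation. It assumes a small hypothetical set of four models with costs `E` and benefits `B`, hypothetical multiplier values `w_E` and `w_B`, and a canonical distribution of the form p_i ∝ exp(-w_E E_i - w_B B_i); this sign convention is an assumption made here for illustration. Under these assumptions, it compares the variance-covariance matrix of (48) with finite-difference derivatives of the means with respect to the multipliers, and evaluates the coefficients of variation of (49).

```python
import numpy as np

def canonical_moments(E, B, w_E, w_B):
    """Return ([<E>, <B>], covariance matrix) for p_i ∝ exp(-w_E*E_i - w_B*B_i)."""
    logits = -w_E * E - w_B * B
    p = np.exp(logits - logits.max())   # subtract the max for numerical stability
    p /= p.sum()
    f = np.vstack([E, B])               # shape (2, n_models)
    mean = f @ p                        # ensemble means of E and B
    cov = (f * p) @ f.T - np.outer(mean, mean)
    return mean, cov

# Hypothetical costs and benefits of four candidate models (illustrative only).
E = np.array([1.0, 2.0, 3.5, 5.0])
B = np.array([0.5, 1.5, 2.0, 4.0])
w_E, w_B = 0.7, 0.3                     # assumed values of the multipliers

mean, cov = canonical_moments(E, B, w_E, w_B)

# Central finite differences: jac[i, j] approximates d<f_i>/d(w_j).
h = 1e-5
d_wE = (canonical_moments(E, B, w_E + h, w_B)[0]
        - canonical_moments(E, B, w_E - h, w_B)[0]) / (2 * h)
d_wB = (canonical_moments(E, B, w_E, w_B + h)[0]
        - canonical_moments(E, B, w_E, w_B - h)[0]) / (2 * h)
jac = np.column_stack([d_wE, d_wB])

# Coefficients of variation from the diagonal of the covariance matrix.
cv_E = np.sqrt(cov[0, 0]) / mean[0]
cv_B = np.sqrt(cov[1, 1]) / mean[1]

print("covariance matrix:\n", cov)
print("minus the derivatives of the means:\n", -jac)
print("C_v(E), C_v(B):", cv_E, cv_B)
```

Under the assumed sign convention the two printed matrices should agree to within finite-difference error, and the covariance matrix should have positive eigenvalues whenever the (E, B) pairs are not collinear.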



Fig. 3 Schematic diagram of Gibbs' geometry, for the MaxEnt
cost-benefit climate model weighting framework of §3.3
(isentropic, isenergetic and iso-benefit lines, with $\langle B \rangle$ plotted against $\langle E \rangle$)

4 Conclusions

In this study, several maximum-entropy frameworks are
presented for the synthesis of outputs from multiple
Earth climate models, based on constraints on the prop-
erties of individual models (microcanonical framework)
or ensembles of models (two canonical frameworks).
The asymptotic and non-asymptotic entropy functions
for each case are derived by combinatorial reasoning,
and applied to simple systems constrained by the to-
tal model energy E (microcanonical) or mean ensem-
ble energy $\langle E \rangle$ (canonical). In each case it is shown
that the MaxEnt method identifies the most represen-
tative (most probable) model from a set of climate mod-
els, subject to the specified constraints, eliminating the
need to calculate the entire set. The parametric and ge-
ometric implications of the underlying Jaynesian math-
ematical structure are examined, with reference to a
canonical framework with competing cost and bene-
fit constraints, allowing interrogation of the trade-off
between costs and benefits. Finally, a finite-time limit
on the minimum cost of modification of the synthesis
framework, at finite rates of change, is also reported.

The foregoing analysis therefore provides climate 
modellers - or those who must rank and combine cli- 
mate models - with a rational tool to amalgamate a 
large set of models into a single representative model (or 
a small representative set). This enables the weighting 
of climate projections from different groups, and will 
also dramatically reduce the computational demand on 
the climate modelling community. Indeed, the benefits 
extend into other fields: as commented by a reviewer, 
for long-range weather forecasts it is common practice 
to combine projections from different meteorological 
models, to improve reliability. The MaxEnt frameworks 
proposed here could equally be applied to this task. 

A caveat to the foregoing analysis is that the in- 
ferred equilibrium climate model will not necessarily 
be the "most correct" model, but merely the one which 
is most representative of the available set of models. If
the model space is incomplete, or the underlying physical
or modelling assumptions of the models are incorrect, any
resulting errors will also be incorporated in the equilibrium
model. A more comprehensive probabilistic framework, 
which incorporates the errors associated with our lack 
of knowledge (of data, phenomena and models), would 
consist of a Bayesian inferential framework extending 
back to all raw climate data, a substantial endeavour 
which - as its minimum condition - would require cli-
mate scientists to abandon their use of orthodox meth-
ods for statistical inference and parameter estimation.



Acknowledgements This work was inspired by discussions at 
the Mathematical and Statistical Approaches to Climate Mod- 
elling and Prediction workshop, Isaac Newton Institute for Math- 
ematical Sciences, Cambridge, UK, 11 Aug. to 22 Dec. 2010. The 
author sincerely thanks the workshop organisers for travel sup- 
port. 



Appendix A: The Least Action Bound 

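The Riemannian geometric interpretation in §3.3 leads
to a rather curious physical limit. Consider a path on
the manifold of equilibrium positions defined by $\mathfrak{H}^*(\langle E \rangle, \langle B \rangle)$
or $\phi(\omega_E, \omega_B)$, specified by some path parameter $\xi$ in the
model space, which may - but need not - correspond
to time. The arc length of the path from position 1 to
2, represented by $\xi = 0$ to $\xi = \xi_{max}$, is given by [27]:

$$ L = \int_0^{\xi_{max}} \sqrt{\dot{\mathbf{f}}^{\top} \mathbf{a}_{\mathfrak{H}} \, \dot{\mathbf{f}}} \; d\xi = \int_0^{\xi_{max}} \sqrt{\dot{\boldsymbol{\Omega}}^{\top} \mathbf{a}_{\phi} \, \dot{\boldsymbol{\Omega}}} \; d\xi \qquad (51) $$

where $\mathbf{f} = [\langle E \rangle, \langle B \rangle]^{\top}$, $\boldsymbol{\Omega} = [\omega_E, \omega_B]^{\top}$ and the overdot
indicates the rate of change with respect to $\xi$. Now, in
the $\mathfrak{H}^*$ representation, the total change in the frame-
work entropy along the same path can be shown to be:

$$ \Delta \mathfrak{H}^* = \int_0^{\xi_{max}} d\mathfrak{H}^* = \frac{\epsilon}{2} \int_0^{\xi_{max}} \dot{\mathbf{f}}^{\top} \mathbf{a}_{\mathfrak{H}} \, \dot{\mathbf{f}} \; d\xi = \epsilon J \qquad (52) $$

where $\epsilon$ is a mean dissipation parameter (e.g. minimum
dissipation time) and $J$ is an action integral defined
within the model space. Similarly, in the $\phi$ representa-
tion, the total change is:

$$ -\Delta \phi = -\int_0^{\xi_{max}} d\phi = \frac{\epsilon}{2} \int_0^{\xi_{max}} \dot{\boldsymbol{\Omega}}^{\top} \mathbf{a}_{\phi} \, \dot{\boldsymbol{\Omega}} \; d\xi = \epsilon J \qquad (53) $$

From the Cauchy-Schwarz inequality, (51)-(53) give, in
either case:

$$ J \ge \frac{L^2}{2 \xi_{max}} \qquad (54) $$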



Eq. (54) can be considered as a generalised least action
bound on processes on the manifold of optimal solutions.
In essence, it specifies the minimum cost or penalty, in
units of $\mathfrak{H}^*$ or $\phi$, to move the system from $\xi = 0$ to
$\xi = \xi_{max}$ at the specified rates $\dot{\mathbf{f}}$ or $\dot{\boldsymbol{\Omega}}$. If the process
occurs infinitely slowly, the lower bound of the action is
zero (it is "reversible"); otherwise, it is necessary to pay
the minimum penalty $\Delta \mathfrak{H}^*_{min} = -\Delta \phi_{min} = \epsilon J_{min} =
\frac{1}{2} \epsilon L^2 / \xi_{max}$ to be able to alter the framework within the
finite parameter duration $\xi_{max}$ (it is "dissipative"). In
the present scenario, we assume that the costs $\langle E \rangle$ and
benefits $\langle B \rangle$ of the model framework are realisable as
external physical quantities, outside the model space
itself; likewise, so will be the entropy $\mathfrak{H}^*$ and potential
$\phi$, either in the units of k or the equivalent informa-
tion units. Eq. (54) therefore provides an information
limit on the minimum price for making alterations to
a constrained modelling framework. (Of course, it ap-
plies to any modelling framework, not just for climate
modelling.) In some sense, this limit is allied to the
informational principles demonstrated by Szilard [44],
Landauer [45], Bennett [46] and many others [47], al-
though it is of quite different character.
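
The least action bound can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical constant positive definite metric (in the framework above the metric is a Hessian matrix that generally varies along the manifold), two hypothetical paths between the same end points, and an action defined as one half of the integral of the quadratic form, which is the normalisation implied by the factor of 2 in the bound. The helper names `trapezoid` and `length_and_action` are illustrative.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def length_and_action(path, metric, xi):
    """Arc length L and action J = (1/2) * integral of fdot^T a fdot along a path.

    path   : (n, 2) array of positions f(xi) = [<E>, <B>]
    metric : (2, 2) positive definite matrix, assumed constant along the path
    xi     : (n,) increasing grid of the path parameter
    """
    fdot = np.gradient(path, xi, axis=0)                 # rate of change df/dxi
    q = np.einsum("ni,ij,nj->n", fdot, metric, fdot)     # quadratic form at each xi
    return trapezoid(np.sqrt(q), xi), 0.5 * trapezoid(q, xi)

# Hypothetical metric, end points and parameter duration (illustrative only).
metric = np.array([[2.0, 0.5], [0.5, 1.0]])
f_start = np.array([1.0, 0.5])
f_end = np.array([3.0, 2.0])
xi_max = 4.0
xi = np.linspace(0.0, xi_max, 2001)
s = xi / xi_max

# Same straight-line route, traversed at a steady rate and at an uneven rate.
steady = f_start + np.outer(s, f_end - f_start)
uneven = f_start + np.outer(s**2, f_end - f_start)

L_steady, J_steady = length_and_action(steady, metric, xi)
L_uneven, J_uneven = length_and_action(uneven, metric, xi)
bound_steady = L_steady**2 / (2.0 * xi_max)
bound_uneven = L_uneven**2 / (2.0 * xi_max)

print("steady rate: J =", J_steady, " bound =", bound_steady)
print("uneven rate: J =", J_uneven, " bound =", bound_uneven)
```

In this sketch the steady-rate path should meet the bound with equality (to rounding error), while the uneven-rate path covers the same arc length at a strictly larger action, which is the sense in which the bound gives a minimum penalty for a given parameter duration.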